Streamlined enterprise accounting with a DaaS framework

HCLTech implemented a transparent and accurate framework that improved data access in the client’s finance data lake.

Introduction

Our client is a Fortune 50 conglomerate operating in several divisions, including aerospace, power, renewable energy, digital industry, additive manufacturing, venture capital and finance. The client wanted to introduce a subscription model that would let users subscribe to data sources, relieving the load on its finance data lake while continuing to accommodate ad hoc requests.

The Challenge

The high costs and potential liability associated with the prevalence of inaccurate information

The client was dissatisfied with its ability to serve internal customers’ data needs and to track the costs associated with data requests. Its system also made it difficult to provide accurate information for the chargeback process, which takes place after a company in the conglomerate accesses data and in which the costs of that access are billed to the central corporate office. In addition, the client struggled to improve data access, transparency and accuracy in intra-conglomerate financial affairs by means of a unified audit log.

The Objective

Implement a cloud native solution for accurate and transparent data access

The client wanted to achieve the following:

Implement a subscription model that would enable users to subscribe to various data sources and retrieve data from them on a fixed schedule, relieving the load on the finance data lake while continuing to accommodate ad hoc requests
Charge users for the costs associated with their data requests, backed by accurate data on transactions
Base the framework on cloud-native technology for flexibility and scalability, which meant abandoning the legacy ETL (extract, transform and load) tool in the new solution

The Solution

Streamlined, more transparent and accurate accounting

HCLTech proposed a list of technologies and approaches to create a data-as-a-subscription framework.

In line with the client’s requirements, the eventual solution was built on existing use cases but was generic enough to accommodate new internal customers and demands easily
The technology included Apache Spark as the analytics engine, AWS Glue as the ETL service providing custom-tailored jobs, and Amazon RDS for PostgreSQL as the database engine
Amazon S3 made it possible to measure the amount of storage used for a request, while Glue made it easy to tag every job run with the name of the requesting subscriber and to track the price of individual jobs, since each job that runs in Glue represents a separate instance
Because the solution’s key AWS-native technologies clearly indicated the costs associated with a request, this combination made it easy to calculate exact chargeback values
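
To illustrate how per-request costs can roll up into chargeback values, here is a minimal Python sketch. It is not the client's implementation: the rates, the JobRun fields and the subscriber names are all hypothetical, and a real calculation would read job-run durations and storage figures from the AWS services themselves rather than from hard-coded records.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical rates for illustration only; real prices depend on region and contract.
DPU_HOUR_RATE = 0.44      # assumed cost of one Glue DPU-hour, in USD
S3_GB_MONTH_RATE = 0.023  # assumed cost of one GB-month of S3 storage, in USD


@dataclass(frozen=True)
class JobRun:
    """One tagged job run (field names are assumptions, not the real schema)."""
    subscriber: str   # name of the subscriber the run was tagged with
    dpu_hours: float  # compute consumed by the run
    stored_gb: float  # storage used for the request


def chargeback(runs):
    """Sum compute and storage costs per subscriber, rounded to cents."""
    totals = defaultdict(float)
    for run in runs:
        cost = run.dpu_hours * DPU_HOUR_RATE + run.stored_gb * S3_GB_MONTH_RATE
        totals[run.subscriber] += cost
    return {name: round(total, 2) for name, total in totals.items()}


# Made-up sample data: two runs for one subscriber, one run for another.
runs = [
    JobRun("aviation-finance", dpu_hours=2.0, stored_gb=100.0),
    JobRun("aviation-finance", dpu_hours=0.5, stored_gb=10.0),
    JobRun("power-finance", dpu_hours=1.0, stored_gb=50.0),
]

if __name__ == "__main__":
    for subscriber, amount in sorted(chargeback(runs).items()):
        print(f"{subscriber}: ${amount:.2f}")
```

Because every run carries its subscriber tag, scheduled and ad hoc requests can be billed through the same aggregation.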

The Impact

Improved conglomerate-wide access to the finance data lake

The flexible framework built by our team enabled the client to serve subscribers with the data they need from its finance data lake (FDL) in a fully auditable, metadata-driven manner and to calculate the exact costs associated with each request, whether scheduled or ad hoc.

This greatly streamlined the related accounting processes and made them more transparent, which helped the client avoid disputes about chargeback amounts
To promote cost savings in the shared environment, an update to the framework made it possible to move part of the ingestion and the architecture to an Amazon EMR cluster once the number of subscribers reached critical mass
Another development was a metadata-driven “pub-sub” subscription model, which automatically approved requests, recorded them in the metadata layer and started the feed, improving the user experience and simplifying maintenance
Additionally, a mirror stream complementing the current data-model-based system enabled true live streaming of data, fulfilled requests quickly and granted access to data that was not part of the data model
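
The metadata-driven “pub-sub” model mentioned above can be pictured with a small in-memory Python sketch. Everything in it is an assumption made for illustration: the MetadataLayer class, its method names and the sample source and subscriber names are hypothetical, and the real framework would persist this state in a database rather than in Python objects.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataLayer:
    """Hypothetical in-memory stand-in for the framework's metadata layer."""
    published: set = field(default_factory=set)        # known data sources
    subscriptions: dict = field(default_factory=dict)  # source -> subscriber names
    audit_log: list = field(default_factory=list)      # (event, subscriber, source)

    def publish(self, source: str) -> None:
        """Register a data source so that it can be subscribed to."""
        self.published.add(source)
        self.subscriptions.setdefault(source, [])

    def subscribe(self, subscriber: str, source: str) -> bool:
        """Auto-approve a request for a published source and log the outcome."""
        if source not in self.published:
            self.audit_log.append(("rejected", subscriber, source))
            return False
        if subscriber not in self.subscriptions[source]:
            self.subscriptions[source].append(subscriber)
        self.audit_log.append(("approved", subscriber, source))
        return True

    def feed_targets(self, source: str) -> list:
        """Return the subscribers that a feed for this source should serve."""
        return list(self.subscriptions.get(source, []))


# Made-up sample usage: one published source, one valid and one invalid request.
layer = MetadataLayer()
layer.publish("general_ledger")
approved = layer.subscribe("power-finance", "general_ledger")
rejected = layer.subscribe("power-finance", "unknown_source")
```

Keeping approvals and rejections in the same log is one simple way to make every request auditable.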